Abstract



The No Child Left Behind Act imposes sanctions on schools if the fraction of students in
any of five racial groups demonstrating proficiency on a high stakes exam
falls below a statewide pass rate. This system places pressure on school administrators to 
redirect educational resources from groups of students most likely to demonstrate 
proficiency towards those who are marginally below proficient. Using statewide 
observations of 3rd and 4th grade math tests, this paper demonstrates that students of 
successful racial groups at schools likely to be sanctioned gain less academically over 
their subsequent test year than comparable peers at passing schools. This effect is 
stronger at schools more likely to suffer from NCLB sanctions and is robust to non- 
random sample selection. 




Demands for school accountability and concerns about racial performance disparities 
culminated in the No Child Left Behind Act (NCLB), the 2002 reauthorization and expansion of 
the Elementary and Secondary Education Act. The NCLB holds districts and buildings accountable
for student performance on state administered high-stakes tests, sanctions failing schools, and 
provides expanded educational opportunities for students attending these schools. Proponents of 
the NCLB hope it will increase educational quality and reduce the racial and income academic 
achievement gaps. However, the implementation of the NCLB also provides incentives to 
reduce academic achievement for some groups of students. This article describes these 
incentives and documents a reduction in scholastic performance among these groups. 

The NCLB institutes a system of performance goals that, if not met, trigger sanctions of 
increasing severity on schools and districts. Yet, as Ladd (2001) suggests, any performance- 
based system suffers from a number of potential pitfalls. For instance, important societal 
standards not covered by performance measures are likely to receive less instructional attention. 
When performance goals are translated into empirical measures, there may be a weak connection 
between the goals and measures. For example, the presence of high-stakes exams encourages 
teaching to the content of the exams thereby improving measured achievement without 
improving educational performance. 1

1 Jacob (2005) finds that gains made on high-stakes tests are not mirrored in low-stakes tests and that the gains
made on high-stakes exams appear to be due to improvements in test-specific skills. Klein et al. (2000) find similar
results when comparing the Texas high-stakes test with the National Assessment of Educational Progress.

In addition to these concerns, the NCLB creates incentives for school administrators to
focus resources on specific racial groups in the hopes of making Adequate Yearly Progress
(AYP). As mandated by the NCLB, each school must test five distinct racial groups: Black,
Hispanic, White, American Indian, and Asian/Pacific Islander. For a school to make AYP, the
percentage of students in each racial group within that school who demonstrate proficiency on a
high-stakes test must exceed a state determined pass rate. Schools with too low a percent of
students demonstrating proficiency in any racial group do not make AYP and are subject to 
school-wide sanctions under the NCLB. By focusing on a binary pass/no pass outcome for each 
racial group, the NCLB provides incentives for administrators to direct resources away from 
racial groups projected to make AYP and target those resources towards members of groups 
thought to be in danger of not making AYP. For instance, an administrator may choose to 
abandon a curriculum that has broad appeal for one that focuses on skills that a lower performing 
group of students lack. Administrators may assign students of weaker racial groups to stronger 
teachers in hopes of raising their high-stakes academic performance leaving students of other 
racial groups in the care of less able teachers. Administrators may choose to fund co-curricular 
activities that appeal to one particular racial group in hopes of raising their academic 
performance. Rouse et al. (2007) document that Florida schools that failed that state’s
accountability standards were more likely to reorganize students within classrooms into smaller
learning “units”, were more likely to mandate a minimum class time spent on high stakes
subjects, and were more likely to reward high teacher performance. Whatever the specific
avenue, responding to the possibility of failure under the NCLB in this way is referred to as 
“strategic instruction” in this paper. This paper documents evidence consistent with the presence 
of strategic instruction and the extent to which it alters academic achievement for students in 
racial groups not targeted by school administrators. 

To test for the presence of strategic instruction, consider two similar students. The first is 
a member of a racial group which made AYP but attends a school that contains another racial 
group that failed to make AYP. The second is a member of the same racial group as the first but 
attends a school that had no groups fail to make AYP. If strategic instruction exists, then the first 
student should gain less academically over the course of the subsequent year than the second
because resources are directed away from the first student in favor of the failing racial group at
her school. Measuring academic differences between these students suggests one method of
identifying strategic instruction. The data employed in this paper allow for a second method.
The data span the period before and after enactment of the NCLB, which makes it possible
to measure how the difference between these two students changed between the periods before
and after the NCLB.

The following econometric estimates are consistent with the strategic instruction 
hypothesis. Using a statewide sample of 4th graders, it is found that students of successful racial
groups who attend schools where another racial group fails to make AYP score lower on a 
subsequent high-stakes test than comparable students at schools without failing racial groups. 
Estimates of this impact are of similar magnitude to the test score decrease that occurs when 
students switch schools midyear and occur after controlling for general differences arising 
between failing and successful schools. Consistent with the strategic instruction hypothesis, this 
difference increases as failing schools face more severe NCLB sanctions and in schools that ex 
ante are more likely to fail to make AYP. These impacts occur despite controlling for student- 
level past standardized test performance, a host of other observable student characteristics, and 
for the racial and socio-economic makeup of schools. These findings are also robust to 
controlling for non-random sample attrition and do not appear to be the result of administrators 
targeting students by their a priori beliefs of student ability. 

A handful of researchers have investigated a form of strategic instruction based not on 
race but on student ability. Chakrabarti (2007) uses disaggregated school-level data to analyze
the behavioral response of schools threatened under Florida’s “opportunity scholarship” program.
This program predates the NCLB but offers similar incentives to school administrators. Under
this program, schools failing for two out of four years must provide students with vouchers. 
Chakrabarti argues that the incentive this program creates is for administrators to focus on 
students who are marginally below the threshold required to pass Florida’s high-stakes test. 
When compared to students at similar but non-threatened schools, Chakrabarti finds that 
marginal students at threatened schools improve performance. Further, Chakrabarti argues that 
the entire test distribution moves to the right, with larger moves for marginal students. 

Burgess, Propper, Slater, and Wilson (2005) examine school accountability for secondary 
students in the United Kingdom. If strategic instruction occurs, schools with a higher proportion 
of marginal students will have a greater incentive to divert resources from students at the tails of 
the ability distribution. Indeed, these authors find that as the proportion of marginal pupils 
increases, all students lose relative to the most able, but the lowest ability group loses the most. 
One possible explanation for the relative stability of the most able students is that UK schools 
have overlapping catchment zones, leading to school competition for the best students. 

Using pre-NCLB Texas data on individual students, Reback (2007) finds that schools 
respond to the Texas accountability system with measures helping low-performing students and 
specific, targeted measures towards students that are critical to the school’s accountability 
ratings. Reback compares students within buildings and finds that those gaining most 
academically are also those who have the highest probability of increasing their school’s 
rankings. In contrast, relatively high achieving students perform worse than expected if their 
performance is unlikely to impact their school’s ratings. 

Before proceeding, two caveats are necessary. First, all prior research on strategic 
instruction in public education has focused on administrators’ a priori beliefs or observations of
student ability. However, arriving at these beliefs involves significantly more uncertainty and
hence a weaker motivation to strategically instruct than is the case if administrators choose to 
target resources based upon observables as obvious as student race. As a result, if one finds 
strategic instruction based upon ability, it is likely that one will also find strategic instruction 
based upon race. However, the converse may not be true. If administrators target resources 
based upon racial characteristics, then strategic instruction may alter the relative performance of 
races rather than the relative performance of students of differing abilities. Secondly, the 
presence of strategic instruction may not result in an inefficient outcome. If, prior to the NCLB, 
schools over-expended resources on students of would-be successful racial groups, then the 
NCLB incentives discussed here may improve overall resource allocation. Further, as suggested 
by Chakrabarti (2007), if building administrators respond to the NCLB by introducing more 
effective teaching techniques, better curriculum, or a more efficient use of resources, then the 
NCLB may improve overall student learning. 

Section 2: The NCLB and Student Testing in Washington 
As part of a move towards educational accountability, the state of Washington introduced 
the Washington Assessment of Student Learning (WASL), a statewide test of reading, writing, 
listening, and mathematics in 1997. 2 The WASL is the state of Washington’s high-stakes test 
used to identify AYP under the NCLB. In the 4th grade the WASL tests mathematics, reading
and writing. In order to avoid complications that arise when combining scores from tests of
different subjects, this paper analyzes only the WASL math results, which have been normalized
to mean zero and variance equal to one within each year.

2 Much of this section describes the Washington testing system as it was in place during the period this paper
addresses. Since that time, Washington has made some changes to its high-stakes testing system, including replacing
the listening test with a science test in 2004. The method of determining AYP based upon individual racial group
performance has not changed and, indeed, is mandated by the NCLBA.
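
The within-year normalization described above can be sketched as follows. This is only an illustration: the function name, the (year, score) layout, and the scale scores are hypothetical and are not taken from the WASL data, and the use of the population standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_year(records):
    """Return z-scores for (year, raw_score) pairs, normalized within each year.

    Assumption: the population standard deviation is used, so that each
    year's normalized scores have mean zero and variance one.
    """
    scores_by_year = defaultdict(list)
    for year, score in records:
        scores_by_year[year].append(score)
    # Mean and standard deviation of the raw scores for each test year.
    stats = {y: (mean(s), pstdev(s)) for y, s in scores_by_year.items()}
    return [(score - stats[year][0]) / stats[year][1] for year, score in records]


# Hypothetical scale scores for two test years.
records = [(2004, 380), (2004, 400), (2004, 420), (2005, 390), (2005, 410)]
z_scores = standardize_within_year(records)
```

Because each year is normalized separately, a score of zero always denotes the statewide mean for that year's test takers, which is what allows cohorts from different years to be pooled.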

The NCLB requires school districts to bring all students to the “proficient” level in 
reading and mathematics by the 2013-2014 school year. In the meantime, individual schools 
must meet state AYP targets toward this goal for both their overall student population as well as 
for eight socio-demographic subgroups: American Indian, Asian/Pacific Islanders, Black, 
Hispanic, White, special education, limited English, and economically disadvantaged students. 

To determine AYP, the state of Washington measures the percentage of a school's students in each of
these nine groups who demonstrate proficiency on the WASL and compares this to the state- 
imposed pass rate. For a school to make AYP, the percentage of the total student body, as well 
as the percentage of each subgroup, must be above the required pass rate. As designed, a single 
student can be a member of many groups and therefore impact a school’s ability to make AYP 
multiple times. For instance, an Asian, limited English student from an economically 
disadvantaged family would be represented in the overall student body as well as three of the 
eight demographic subgroups. If this student fails to demonstrate proficiency on the high stakes 
test, then this failure is represented in the overall calculation of percent proficient as well as the 
calculation of the three subgroups. 
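
A minimal sketch of this group-by-group comparison is given below. The function names and the school's percentages are hypothetical; the sketch also ignores the participation requirement and any other provisions, so it illustrates only the basic pass-rate rule.

```python
def failing_groups(percent_proficient, required_pass_rate):
    """List the groups whose percent proficient falls below the required rate.

    percent_proficient maps each group present at the school (the overall
    student body plus any subgroups) to its percent proficient (0-100).
    """
    return sorted(
        group
        for group, pct in percent_proficient.items()
        if pct < required_pass_rate
    )


def makes_ayp(percent_proficient, required_pass_rate):
    """A school makes AYP only if no group falls below the required rate."""
    return not failing_groups(percent_proficient, required_pass_rate)


# Hypothetical school: one weak subgroup is enough to fail the whole building.
school = {
    "All students": 48.0,
    "Asian/Pacific Islander": 55.0,
    "Limited English": 30.0,
    "Economically disadvantaged": 41.0,
}
school_failing = failing_groups(school, 35.56)
school_makes_ayp = makes_ayp(school, 35.56)
```

In this hypothetical school only the limited English subgroup is below the rate, yet the building as a whole does not make AYP, which is the binary outcome that creates the incentive discussed in this paper.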

Required pass rates in Washington are calculated by first determining the cumulative 
twelve-year improvement needed between 2001-02, when the NCLB was implemented, and 
2013-14, in order to have 100% of all students demonstrate proficiency at the end of this period. 
This total improvement is then evenly divided over the twelve year period. For example, in 
2001-2002, 29.7% of 4th grade students were rated as math proficient by Washington. If this
figure rises by 5.86 percentage points in each of the subsequent 12 school years, the goal of
100% proficiency would be attained by 2013-2014. Thus, the mathematics pass rate
required to make AYP in the 2002-03 school year was 29.7%+5.86% = 35.56%. A school with
fewer than 35.56% of its overall student body (or of any subgroup) demonstrating math
proficiency in 2002-2003 would be classified as not meeting AYP. 4 Finally, AYP is granted
only if 95% of all continuously enrolled students at each grade level take the WASL. In 2008,
43.4% of 4th graders demonstrated proficiency in all three phases of the WASL (reading, writing,
and math) and 53.6% demonstrated proficiency on the math portion. In that year, 50.9% of
schools offering 4th grade had too few students demonstrating proficiency to meet the
required pass rate and hence did not make AYP.

3 Indeed, the vast majority of schools failing to make AYP in 2004-2005 did so because of a failure to achieve the
required pass rate in mathematics. In 2005, of the 207 Washington buildings failing to make AYP, 161 were due to
poor math scores.
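
The equal-increment schedule described above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name and the convention of labeling each school year by its spring calendar year (2003 for 2002-03) are assumptions.

```python
def required_pass_rates(baseline_pct, baseline_year=2002, target_year=2014,
                        target_pct=100.0):
    """Equal annual increments from the baseline rate to the target rate.

    Years are labeled by the spring of the school year (2003 = 2002-03).
    """
    n_years = target_year - baseline_year  # 12 school years after the baseline
    increment = (target_pct - baseline_pct) / n_years
    return {
        baseline_year + k: baseline_pct + k * increment
        for k in range(1, n_years + 1)
    }


# 4th grade math: 29.7% proficient in the 2001-02 baseline year.
rates = required_pass_rates(29.7)
rate_2003 = round(rates[2003], 2)  # required pass rate for 2002-03
```

The unrounded annual increment is (100 - 29.7)/12, or roughly 5.86 percentage points, so the 2002-03 rate rounds to the 35.56% reported in the text.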

The NCLB prescribes specific penalties for schools receiving Title I funds failing to meet 
AYP, but it allows states to determine the structure of penalties for non-Title I schools. For 
example, in the case of Title I schools that fail to make AYP for two years in a row, students in 
the school must be allowed to transfer to a school in the same district that makes AYP. In this 
case, the NCLB requires up to 5 percent of the district’s Title I funds be used to pay for transfer 
students’ transportation. Schools failing to show improvement over three years are required to 
provide supplemental educational services including private tutoring. Those failing over a 
longer time period are required to replace teachers or administrators, and in extreme cases, incur 
the loss of local governance. This increased scope of sanctions for schools failing to make AYP 
in consecutive years is later used to test for the presence of strategic instruction. However, as Figlio
and Lucas (2004) point out, schools performing poorly on state assessments impact not only
themselves but also their communities through diminished property values. Thus, schools face
considerable pressure to improve measured performance on high stakes tests.

4 In order to not penalize schools that begin far from the state mandated pass rate, the NCLBA created the “safe
harbor” provision, which grants AYP to schools that fail to make AYP as described above but reduce the number
of students failing to show proficiency on the WASL by 10%. The safe harbor provision maintains the incentive for
administrators to target the students on the margin of passing in order to show 10% gains. In the data this paper
uses, 7 schools offering 4th grade achieved AYP through the safe harbor provision. This represents .57% of the state’s
elementary schools and .56% of 4th graders.

In addition to the WASL, Washington students take the Iowa Test of Basic Skills (ITBS). 
The Iowa tests are standardized exams identifying a student’s academic level. The ITBS is given 
in Washington near the end of the student’s 3rd grade year, the year immediately prior to the
WASL. Using the ITBS math results presents a number of advantages. First, since the ITBS is 
not employed as a tool to determine AYP, it is unlikely to be the direct focus of strategic 
instruction. Instead it may be a tool used by administrators who decide how to allocate resources 
across students. Secondly, since the ITBS is given the year previous to the WASL, it can be 
used as a proxy for student ability. As such, this paper compares the academic progress of 
students from the time of taking the ITBS to their completion of the WASL. Another advantage 
conveyed with the ITBS data is the large number of demographic, socioeconomic, and
academic variables observed. These variables are used as explanatory variables in later
regressions. Unlike the WASL scores, a student’s ITBS score is measured as a percentile relative to all
3rd graders nationwide taking the ITBS.

Optimally, a researcher would compare schools under the NCLB with those that were not 
impacted by the NCLB to test if strategic instruction took place. But, since all public schools in 
Washington are subject either to the NCLB, state-level sanctions tied to NCLB, or both, there is 
no direct control group with which to compare strategic instruction practices. However, as 
suggested by Rouse et al. (2007), schools having failed to make AYP in prior years are more
likely to change instruction strategies in future years in order to avoid the increasing sanctions 
for failing AYP. Further, both the ITBS and WASL have been given in Washington since the 
mid-1990s, which creates the possibility of a before-and-after identification strategy. If the
NCLB creates strategic instructional behavior, then differential WASL outcomes should be
found among schools under the threat of sanctions and should be present only after the NCLB 
was enacted. 



Section 3: Data and Descriptive Statistics 

The data used in this article consist of four cohorts of paired observations of ITBS/WASL 
scores for third/fourth graders. The first observed cohort of third graders took the ITBS in the
spring of 2001 and the WASL in the spring of 2002. The final observed cohort took the ITBS in 
2004 and the WASL in 2005. The state of Washington did not define AYP until late in the 
spring of 2002 and only notified schools of their AYP status after the subsequent school year 
commenced. Hence the first two cohorts began the school year in which they took the WASL 
before their building administrator knew their building’s AYP status. Administrators had little 
opportunity to pursue strategic instruction for these cohorts. Students in the final two cohorts 
began their WASL year after schools knew their AYP status, so administrators had the ability
to pursue strategic instruction for these students. The heterogeneity between these two sets of 
cohorts offers one opportunity for identifying the impact of the NCLB. 

As a first attempt to investigate strategic instruction, the final two cohorts are
examined: the cohorts who took the WASL in buildings where the principal knew the building’s AYP
status from the preceding year. After excluding special education students and those with
missing observations, the pooled number of student observations in the last two cohorts is
112,485. This represents 74.8 percent of all Washington public 4th grade students and 85.7
percent of all non-special education students. Panel A of Table 1 divides these cohorts into two
groups: students at schools that made AYP in the previous year and students who are members
of a race that made AYP in the previous year but who attend buildings which did not make AYP
because another racial group failed. For example, this second group includes Hispanic students
if Hispanics at their building made AYP but Whites did not. In this example, Panel A of Table 1
would not include those White students.

Panel A of Table 1 demonstrates a number of important features. First, a significant 
difference in WASL and ITBS performance occurs between passing students at AYP schools and 
passing students at failing schools. On average, students at AYP schools score .085 standard 
deviations above the state WASL mean and average just above the 62nd percentile on the ITBS.
Members of a successful racial group at a failing school score .27 standard deviations below the
average on the WASL and at the 56th ITBS percentile. Since Table 1 includes only those
students in racial groups that made AYP, this difference is not caused by inclusion of failing
groups of students. Rather, the difference likely arises from any number of factors. 5 For
instance, AYP schools have roughly half the free/reduced lunch population relative to non-AYP
schools and students of successful racial groups at non-AYP schools are much more likely to be
minorities than those at AYP schools.

If strategic instruction occurs, passing students at failing schools will make smaller gains 
between their ITBS test year and their WASL year than do observationally equivalent students at 
AYP schools. Panel A of Table 1 suggests that this may be the case. Students at failing schools 
averaged at the 56th percentile on the 3rd grade ITBS. Those at passing schools averaged at the
62nd percentile, a statistically significant but relatively small difference in performance. In their
fourth grade year, students at failing schools averaged .27 standard deviations below the WASL
average (the 34th WASL percentile), while those at passing schools averaged almost .1 standard
deviation above average (the 49th WASL percentile). This large difference in WASL
performance relative to ITBS performance suggests significant academic improvement of
students at passing schools relative to passing students at schools that failed. Of course these
differences could be explained by a number of competing hypotheses other than strategic
instruction. For instance, the difference between a 56th and 62nd ITBS percentile student may be
large in terms of academic competency, making it difficult for schools to transform low ITBS
students into high WASL scorers. Schools that pass all their students may use better teaching
techniques which account for the increased performance of all their students. Or, passing
schools may have a different composition of students that makes the school successful. To help
distinguish between these possibilities, consider Panel B of Table 1. Panel B presents descriptive
statistics for the two cohorts that were observed prior to Washington schools knowing their AYP
status. These two cohorts are separated into two groups: students at buildings that will make
AYP in the future and students of a racial group that will make AYP in the future who attend
schools that will not make AYP. Thus, Panel B simply presents the same schools and racial
groups as does Panel A but does so for the two years prior to NCLB.

5 See Krieg and Storer (2006).

Contrasting students at AYP schools before and after the imposition of NCLB (the first 
columns of Panels A and B) demonstrates little difference between passing schools before and
after the NCLB. Before and after differences in WASL and ITBS scores are small as are student
and building demographic measures. Of course, little difference is to be expected in school
performance if those schools expect to make AYP and do not alter their instructional practices.
However, the relative performance of students of a passing race at failing schools appears
higher prior to the NCLB. These students averaged at the 54th percentile on the ITBS, a small
difference from their post-NCLB average of the 56th percentile. However, these students scored
.21 standard deviations below the WASL average (the 44th WASL percentile), a rather large
improvement over the -.27 standard deviation (the 34th percentile) performance expected of
similar students after the NCLB. This decrease of relative performance is explored more
systematically in the next section. 



Section 4: Econometric Evidence 

The preceding descriptive statistics suggest that the gains made for students in a passing 
racial group at a failing school were larger prior to the NCLB than after it. These statistics also 
suggest the following econometric approach to exploring this further. Consider the regression: 

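$$
\mathrm{WASL}_{itb} = \beta_0 + \sum_{j=1}^{L} \phi_j \,\mathrm{ITBS}_{ib}^{\,j} + \alpha\,\mathrm{AYPFAIL}_{tb} + \gamma\,\mathrm{AYPFAILRACE}_{itb} + \lambda\,\mathrm{NCLBA}_{t} + \psi B_{tb} + \kappa X_{itb} + \varepsilon_{itb} \tag{1}
$$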

where $\mathrm{WASL}_{itb}$ is student i’s test score during time period t in building b, $X_{itb}$ is a matrix of
student-specific control variables, and $B_{tb}$ represents a matrix of time-varying building control
variables. 6 AYPFAIL is a binary variable equaling one if one of the five racial groups at the
building failed to make AYP in the previous year. Since no buildings failed to make AYP during
the first two observed years, AYPFAIL is equal to zero for all of these observations. NCLBA is a
binary variable equaling one for the last two observed cohorts (those cohorts who took the
WASL after full NCLB implementation). Because of the non-linear relationship between tests
measured on a percentile basis and those measured in standard deviations around the mean,
WASL test scores are assumed to be a polynomial function of ITBS scores, where the degree of
the polynomial, L, is determined by minimizing the AIC.
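
The choice of polynomial degree by AIC can be illustrated with a small sketch. It is not the estimation used in this paper: it omits every regressor other than the ITBS polynomial, uses synthetic data in which the outcome is the exact normal quantile of the percentile, and assumes the Gaussian form AIC = n ln(RSS/n) + 2k. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def poly_rss(x, y, degree):
    """Residual sum of squares from an OLS fit of y on 1, x, ..., x**degree."""
    rows = [[xi ** j for j in range(degree + 1)] for xi in x]
    k = degree + 1
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def aic(x, y, degree):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2k."""
    n = len(y)
    return n * math.log(poly_rss(x, y, degree) / n) + 2 * (degree + 1)


def choose_degree(x, y, max_degree=5):
    """Return the polynomial degree between 1 and max_degree with the lowest AIC."""
    return min(range(1, max_degree + 1), key=lambda d: aic(x, y, d))


# Synthetic illustration: percentiles 1..99 (centered) mapped to z-scores.
percentiles = list(range(1, 100))
x = [p / 100 - 0.5 for p in percentiles]
y = [NormalDist().inv_cdf(p / 100) for p in percentiles]
best_degree = choose_degree(x, y)
```

In this synthetic case a linear term alone fits poorly in the tails of the percentile distribution, so the AIC favors a higher-order polynomial, which is the non-linearity the text refers to.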



6 The student control variables include nine binaries representing ethnicity, five binaries representing the duration of 
student enrollment in the school, four binaries indicating their frequency of reading for fun, six binaries indicating 
their amount of daily television watched, three binaries indicating the frequency of speaking English at home, 
gender, the amount of computer usage at school, the presence of a computer at home, and if they skipped or were 
held back a grade. The building control variables include the percentage of student body in each of the five NCLBA 
racial groups, the percent of students receiving free or reduced lunches, the average building enrollment and its 
square, and five binary variables indicating the building type (traditional elementary, comprehensive, parent 
partnership program, internet/computer school, or alternative school). 
The variable of interest in equation (1) is AYPFAILRACE, which equals one for students
of a racial group which made AYP in the previous year and who attend a school that failed to
make AYP because of the failure of another racial group. If strategic instruction occurs, then $\gamma$,
the coefficient on this variable, would be less than zero, indicating that WASL performance was
lower for students of a successful racial group who attend schools that failed because of the prior
performance of another racial group.
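
The two indicators can be constructed from a building's prior-year results as in the following sketch. The function name and input layout are hypothetical, and treating a student whose racial group was not evaluated at the building as not belonging to a passing group is an assumption of the sketch.

```python
def ayp_indicators(student_race, group_made_ayp):
    """Return (AYPFAIL, AYPFAILRACE) for one student.

    group_made_ayp maps each racial group evaluated at the student's building
    to whether that group made AYP in the previous year.
    """
    # AYPFAIL: at least one racial group at the building failed last year.
    ayp_fail = int(not all(group_made_ayp.values()))
    # AYPFAILRACE: the building failed, but the student's own group passed.
    # Assumption: a group absent from the mapping is not counted as passing.
    own_group_passed = group_made_ayp.get(student_race, False)
    ayp_fail_race = int(ayp_fail == 1 and own_group_passed)
    return ayp_fail, ayp_fail_race


# Building where Hispanic students made AYP but White students did not.
mixed_building = {"Hispanic": True, "White": False}
hispanic_student = ayp_indicators("Hispanic", mixed_building)
white_student = ayp_indicators("White", mixed_building)
# Building where every evaluated group made AYP.
passing_student = ayp_indicators("White", {"Hispanic": True, "White": True})
```

Only the Hispanic student in the mixed building receives AYPFAILRACE equal to one; the White student in that building shares the building-level AYPFAIL indicator but belongs to the failing group.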

Panel A of Table 2 presents estimates of equation (1). Students of races that did not 
cause the previous AYP failure are expected to score .050 standard deviations lower on the WASL
than similarly situated students at passing schools. To put this into context, the (unreported)
coefficient on black (relative to white) is -.146, the impact of changing schools midyear is -.048,
of having been held back at least one grade is -.092, and of having a computer at home .060.
Thus, a student of a passing racial group who attends a school that had another racial group fail
is expected to have their WASL performance diminish by about the same amount as would occur
if he or she changed schools midyear or about the same as the difference that occurs between
students with and without a computer at home. 

Of interest in Panel A of Table 2 is the negative coefficient associated with failing to 
make AYP. All students at schools that failed to make AYP in the prior year can be expected to
score .044 WASL standard deviations lower than students at schools making AYP in the prior year. Among
other things, this may be the result of unobserved differences in student composition, teacher 
recruitment and retention, and financial differences between passing and failing buildings. It is 
important to note that the strategic instruction finding occurs in the presence of this AYP status 
variable suggesting that there is an additional racial component to AYP failure. 
Panel A of Table 2 also presents an estimated coefficient associated with the variable
NCLBA. Conditional expected WASL test scores are .03 standard deviations higher after
enactment of the NCLB, about 60% of what students of successful racial groups at failing
schools expect to lose. Potential explanations for this improvement are many: the NCLB better
focused resources on academics, teachers may teach to the test, or the resources associated with 
NCLB were used efficiently by school administrators. Regardless of the cause, Table 2 presents 
a picture of a change in relative racial performance among students at failing schools and a 
simultaneous small, but statistically significant, increase in overall test performance. 

If the estimate of $\gamma$ = -.050 is the result of administrators focusing attention on racial
groups who previously failed to make AYP, then this difference should grow in magnitude at
schools that have failed to make AYP in consecutive years resulting in more severe NCLB
sanctions. In these data, it is possible to identify schools and races that have failed AYP for two
consecutive years. Consider the regression: 

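$$
\mathrm{WASL}_{itb} = \beta_0 + \sum_{j=1}^{L} \phi_j \,\mathrm{ITBS}_{ib}^{\,j} + \alpha\,\mathrm{AYPFAIL}_{tb} + \nu\,\mathrm{AYPFAILTWICE}_{tb} + \gamma\,\mathrm{AYPFAILRACE}_{itb} + \zeta\,\mathrm{AYPFAILRACETWICE}_{itb} + \lambda\,\mathrm{NCLBA}_{t} + \psi B_{tb} + \kappa X_{itb} + \varepsilon_{itb} \tag{2}
$$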

where AYPFAILTWICE equals one if a racial group at the building failed to make AYP for two consecutive
years. AYPFAILRACETWICE equals one if a student is a member of a racial group that
successfully made AYP for two consecutive years and attends a school where another racial
group failed to make AYP for two consecutive years. 7 If strategic instruction occurs, one would
expect $\zeta$ to be negative, suggesting a further decrease in academic performance.



7 In the case of consecutive failure, students are assigned the value of one for each of AYPFAILRACE and
AYPFAILRACETWICE. Buildings are assigned the value of one for each of AYPFAIL and AYPFAILTWICE.
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Panel B of Table 2 presents estimates of selected coefficients from equation 2. These 



results support the strategic instruction hypothesis. Students of a passing racial group at a school 
failing in the previous year expect to score .055 standard deviations lower on the WASL than 
similar students at passing schools. Moreover, students of a passing racial group at a school that
failed for two consecutive years can be expected to lose an additional .035 standard deviations.
This larger decline would be expected if school administrators focused increased attention on
needy groups of students at the expense of those that have traditionally performed adequately on
the WASL.
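As a worked illustration of how the two estimates combine, recall that under consecutive failure a student of a passing racial group is assigned a one on both AYPFAILRACE and AYPFAILRACETWICE. Writing $\hat{\gamma}$ for the coefficient on AYPFAILRACE and $\hat{\zeta}$ for the coefficient on AYPFAILRACETWICE (the symbols are used here only for this calculation), and using the magnitudes quoted in the text, the implied total difference relative to similar students at passing schools is:

```latex
\hat{\gamma} + \hat{\zeta} = -0.055 + (-0.035) = -0.090 \ \text{standard deviations}
```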



Section 5: Robustness Checks 

Tables 1 and 2 suggest that students of a successful racial group at schools that fail to 
make AYP perform worse than similarly situated students at passing schools. While this may be 
due to strategic instruction, alternative hypotheses are possible and explored in this section. 

While strategically targeting students based upon their race may be plausible, school 
administrators are privy to information that may result in a more efficient form of strategic 
instruction that the prior empirical strategy mistakenly identifies as being based upon race. As 
suggested by Reback (2008), student test history presents administrators with rough estimates of 
each student’s propensity to show proficiency on the WASL. Rather than targeting students 
based upon race, an administrator could target instructional resources using ITBS test history and
their subsequent perceptions of individual student ability. For instance, based upon their 3rd
grade ITBS score, an administrator could place students on the perceived margin of passing the
WASL in a strong 4th grade teacher's classroom and place very strong and very weak students
with less able teachers. 8 This would result in increased learning for middle-ability students and
lower gains for students on the tails of the ability distribution. As long as high ability students 
continue to pass the WASL, this strategy would maximize the percent of students passing the 
WASL and therefore the school's probability of making AYP. If administrators behave this way
and if test scores are correlated with race, then the results from Tables 1 and 2 would occur not
because of strategic instruction focusing on student race, but rather because the students receiving
decreased attention are those whose previous test scores lead administrators to perceive them as
requiring the least academic attention.

One way of testing for this possibility is to interact AYPFAIL with each student's 3rd
grade ITBS score. If administrators at schools that failed to make AYP in the previous year
direct resources away from students who scored well on the 3rd grade test, then the coefficient on this interacted
variable (AYPFAILxITBS) will be negative and its presence should cause the significant
coefficient on AYPFAILRACE to become insignificant. To control for nonlinearities
in this potential relationship, polynomials of ITBS are interacted with AYPFAIL and
included in:

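$$
\begin{aligned}
WASL_{itb} = {} & \beta_0 + \sum_{j=1}^{L} \beta_j \, ITBS_{ib}^{j} + \sum_{j=1}^{M} \xi_j \, \left( AYPFAIL_{tb} \times ITBS_{ib}^{j} \right) + \alpha \, AYPFAIL_{tb} \\
& + \gamma \, AYPFAILRACE_{itb} + \lambda \, NCLB_{t} + \psi B_{bt} + \kappa X_{itb} + \varepsilon_{ibt}
\end{aligned}
\tag{3}
$$

where L and M are determined by minimizing the AIC.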
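The selection of the polynomial orders by the AIC can be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes the residual sum of squares (RSS) from each candidate regression is already available, uses the Gaussian form of the AIC up to an additive constant, and all function names and numbers are hypothetical.

```python
import math


def aic(rss, n_obs, n_params):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def select_orders(rss_by_orders, n_obs, n_other_params):
    """Return the (L, M) pair with the smallest AIC.

    rss_by_orders maps a candidate (L, M) pair to the RSS of the regression
    estimated with those polynomial orders (hypothetical inputs).
    n_other_params counts the regressors that do not vary with L or M.
    """
    def score(orders):
        l_order, m_order = orders
        n_params = n_other_params + l_order + m_order
        return aic(rss_by_orders[orders], n_obs, n_params)

    return min(rss_by_orders, key=score)


# Made-up RSS values: extra polynomial terms reduce the RSS only slightly,
# so the 2k penalty favors the smaller specification.
candidate_rss = {(1, 1): 50.0, (2, 1): 40.0, (2, 2): 39.9, (3, 3): 39.8}
best_orders = select_orders(candidate_rss, n_obs=100, n_other_params=5)
```

The design point is that a higher order is retained only when it lowers the RSS by enough to offset the penalty of two per added coefficient.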

Table 3 presents estimates of ξ, the coefficients on the AYPFAILxITBS interaction terms, which are individually insignificantly different from zero
(though they do jointly explain WASL results). γ, the coefficient on AYPFAILRACE, remains
negative, statistically significant, and of slightly larger magnitude than the OLS estimate from
equation (1). This suggests that strategic instruction is racially based and not based upon prior
observation of student test scores.

8 This form of strategic instruction was explored by Krieg (2008), Reback (2008), and Chakrabarti (2007).

A second robustness check involves sorting the sample to control for school outliers. The 
possibility exists that, based upon the composition of their student bodies, some schools are so 
certain of making AYP (or so certain of failing to make AYP) that administrators face no 
incentive to perform strategic instruction. If this is the case, then the prior results may understate 
the impact of strategic instruction in those schools that perform it. On the other hand, there is 
high variance across schools in measures like free and reduced price lunch participation, 
academic achievement of teachers, student demographics and resources per pupil. If the decision 
to participate in strategic instruction is correlated with these measures, it is possible that a few 
schools acting as outliers lead to the prior findings. 

To sort the sample, consider the building-level logit regression: 

(4) $\Pr(Y_b = 1) = f(\psi B_b + \varepsilon_b)$

where Y is equal to zero if a building failed to make AYP in either of 2004 or 2005 (the last two 
observed cohorts), and B represents the building control variables used in equations (1) through 
(3) measured in 2001. This logit can be thought of as a forecast of which schools will make 
AYP based upon their characteristics observed at the time of NCLB enactment. From this logit 
regression, predicted probabilities of a building making AYP are generated, sorted, and divided 
into the lowest, middle, and highest thirds. Using the entire sample of students, equation (1) is 
then re-estimated for each third and results are presented in Table 4. 
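The sorting step can be illustrated with a short Python sketch. It assumes the logit index for each building has already been estimated; the building labels, index values, and function names are hypothetical, and the rule for handling group sizes that do not divide evenly by three is an assumption.

```python
import math


def logit_probability(index):
    """Logistic transform of a fitted logit index into a probability."""
    return 1.0 / (1.0 + math.exp(-index))


def split_into_thirds(prob_by_building):
    """Sort buildings by predicted probability and cut into three groups."""
    ordered = sorted(prob_by_building, key=prob_by_building.get)
    n = len(ordered)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {
        "lowest": ordered[:cut1],
        "middle": ordered[cut1:cut2],
        "highest": ordered[cut2:],
    }


# Hypothetical fitted indices for six buildings.
fitted_index = {"A": -2.0, "B": 0.0, "C": 2.0, "D": -1.0, "E": 1.0, "F": 3.0}
predicted = {b: logit_probability(v) for b, v in fitted_index.items()}
thirds = split_into_thirds(predicted)
```

Each student would then be assigned to the third of his or her building before the regression is re-estimated on that group.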

Table 4 presents evidence that all schools act strategically with respect to student race, 
but the predominant effects occur at schools in the lower third of the predicted probability of 
making AYP. Students of a passing race at schools in this group that failed to make AYP can
expect to score .062 WASL standard deviations lower than similar students at passing schools in
this group. Schools in the middle and top thirds of the probability of making AYP show less
evidence of strategic instruction. In both cases, students of passing races are expected to score .021 standard
deviations worse than comparable students; however, neither of these estimates is statistically
different from zero. This pattern of findings is consistent with administrators at schools
perceived to be in danger of failing to make AYP acting aggressively by redirecting resources 
towards racial groups that may cause the failure. Schools less likely to fail have much less 
urgency in following this course of action and a much smaller racial impact results. 

A final explanation for these findings is that the composition of students taking the 
WASL differs between AYP and non-AYP schools and this difference is not accounted for by
the independent variables in the regressions. This concern has been addressed by a number of 
studies, especially with regard to strategic placement of students in special education programs 9 
and through strategic administrative exclusion of students most likely to fail their high stakes 
test. 10 If non-random selection of students omitted from this analysis occurs, then the results of 
the prior regressions may be biased in favor of finding strategic instruction. 

9 See Figlio and Getzler (2002), Deere and Strayer (2001), Cullen and Reback (2006), and Jacob (2005).

10 Figlio (2006) finds that during test weeks in Florida, the duration and frequency of disciplinary suspensions for
low-performing students in grades that face high-stakes tests increase.

Table 5 presents counts of included and missing observations of general education
students by year. Over the time period examined, the percentage of valid general education
students with complete WASL and ITBS observations has remained stable, suggesting that the
NCLB did little to change the trend of missing exams. In addition, the numbers of missing
observations are relatively small; over the four cohorts observed, less than 15% of all Washington
general education students are missing. Unless there is a high correlation between being
unobserved and WASL performance, this small number of missing observations is unlikely to
overturn the prior results. However, one can imagine failing schools encouraging some students
to take the WASL while simultaneously discouraging others in hopes of making AYP. This
non-random attrition needs exploration before concluding that strategic instruction exists.
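The share of excluded students follows directly from the totals reported in Table 5. As a quick check of the arithmetic, summing the three categories of missing observations and dividing by the number of general education students gives:

```latex
20{,}125 + 15{,}107 + 3{,}006 = 38{,}238 = 266{,}476 - 228{,}238,
\qquad
\frac{38{,}238}{266{,}476} \approx 0.143
```

That is, roughly 14.3 percent of general education students are excluded, consistent with the "less than 15%" figure in the text.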

To test for the possibility of sample selection bias, a two-stage Heckit procedure is
employed. 11 In the first stage, a probit augments the regressors from equation (1) with the
contemporaneous percentage change of a county's population to estimate whether a student missed the
WASL. Because a primary reason for missing the WASL is that students move from their local 
school district, including the percentage change in the local population may help explain sample 
attrition. The second stage of the Heckit procedure adds the inverse Mills ratio from this probit 
to equation (1). Results from these two regressions are presented in Panel A of Table 6. 
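The term added in the second stage can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the author's code: it uses the textbook inverse Mills ratio, φ(z)/Φ(z), evaluated at the probit index for being observed. Because the first stage here models the probability of missing the WASL, the sketch assumes the index for being observed is the negative of that fitted index. The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_mills(index_observed):
    """phi(z) / Phi(z), with z the probit index for being observed."""
    return _STD_NORMAL.pdf(index_observed) / _STD_NORMAL.cdf(index_observed)


def inverse_mills_from_missing_index(index_missing):
    """Inverse Mills ratio when the probit models missing the test.

    P(observed) = 1 - Phi(m) = Phi(-m), so the index for being observed
    is taken to be -m (an assumption about the sign convention).
    """
    return inverse_mills(-index_missing)


# A student with a low fitted probability of missing the WASL gets a small
# correction term; a student likely to be missing gets a large one.
low_risk = inverse_mills_from_missing_index(-1.0)
high_risk = inverse_mills_from_missing_index(1.0)
```

The resulting value would be computed for every student who took the WASL and added as a regressor to equation (1).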

Analysis of the first stage probit in Panel A of Table 6 reveals that students do miss the 
WASL systematically. Schools in counties with high population growth are more likely to enroll 
students who later miss the WASL. After the passage of the NCLB, the conditional probability 
of individuals missing the WASL declined. Further, students at schools that failed to make
AYP in the previous year are also less likely to miss the WASL. This may be an artifact of
failing schools more aggressively recruiting additional test takers in hopes of improving past 
performance. Finally, students of a passing racial group at a school that failed to make AYP are 
more likely to miss the WASL. Possibly, these students do not receive the encouragement to 
take the WASL from their administrators to the same extent as those in failing racial groups. 
Whatever the reason, if these students are stronger than average test takers, the prior regression 
results would actually understate the impacts of strategic instruction. 

The second stage Heckit results in Panel A of Table 6 explore the possible biases that
occur because of WASL attrition. Relative to the estimates of equation (1), the Heckit results
suggest that the impacts of racial strategic instruction are actually larger after correcting for non-random
WASL attrition. Students of a successful racial group are expected to score .087
standard deviations worse than comparable students, a large but statistically insignificant
difference when compared to the OLS estimates. 12

11 See Wooldridge (2002), chapter 17 for details.

A related non-random attrition concern exists. If high ability students are more likely to 
leave a school that recently failed to make AYP, then estimates of student performance of 
remaining students may appear lower, leading to the strategic instruction conclusion. To check
for this possibility, AYPFAIL was interacted with a cubic in student ITBS scores and these
interactions were included in the first stage probit of Panel A, Table 6. If higher ability students
leave failing schools with higher probability than lower ability students, coefficients on these
variables will be positive. These coefficients turn out to be individually and jointly insignificant
(F = .1923, p = .901), suggesting that non-random attrition by ability does not lead to the strategic
instruction conclusion.

A second possible selection bias is suggested by Table 5. A significant number of 3rd
graders miss the ITBS test. Since this test is an explanatory variable in all prior regressions, its
omission may bias estimates of the WASL results if students miss the ITBS in a non-random
fashion. However, since the ITBS is an explanatory variable, the Heckit procedure cannot be
used to explore whether omitting those who failed to take the ITBS biases the regression coefficients.
To check for the importance of missing the ITBS, a second two-stage procedure is followed.
The first stage employs the subsample with complete observations of ITBS scores and estimates
the regression:



(5) $ITBS_{ibt} = \kappa X_{itb} + \psi B_{bt} + \sum_{j=1} \theta_j \, WASL_{ibt}^{j}$

12 If more able students are more likely to leave lower performing schools, then one may conclude strategic
instruction exists when it does not. However, the first stage probit model includes ITBS scores as independent
variables. The unreported coefficients on ITBS scores are jointly significant and negative indicating that higher
scoring ITBS students are less likely

where X and B are defined as in equation (1). Using the estimated coefficients from (5),
predicted ITBS scores are generated only for those students with missing ITBS scores. The
students with generated ITBS scores are then integrated into the sample and equation (1) is
re-estimated, with results presented in Panel B of Table 6. Results from this procedure are broadly
consistent with those in Table 2; WASL scores are .060 standard deviations lower for students of
passing racial groups at failing schools and the impacts of the NCLB and school AYP failure are
similar to the prior OLS estimates.

As a final check for sample selection bias, the preceding two analyses were merged. 
Using equation (5), ITBS scores were created for those individuals missing the ITBS and then 
integrated into the complete data set. Using these data, a second Heckit procedure accounting for
missing the WASL was estimated, with results presented in Panel C of Table 6. The results of this
procedure were broadly similar to those of the prior Heckit model. Students of a passing race at a failing
school are expected to score .071 standard deviations lower on the WASL than comparable
students. Taken as a whole, it does not appear that non-random WASL or ITBS attrition
accounts for the racial impacts of the NCLB.

Section 6: Discussion and Conclusions 

This article demonstrates a differential impact of the NCLB on racial groups depending
upon their own and other racial groups' prior success on a high stakes test. Students of a successful
racial group at a school where another racial group failed to make AYP are expected to score
.050 standard deviations lower on Washington's high stakes test than are similar students who
attend a school where no racial group failed. This test difference is of similar magnitude to the
conditional impact of switching schools midyear and the conditional differences occurring
between students having and not having computers at home. This finding occurs in the presence
of individual controls for prior standardized test scores, demographic features, and individual
student characteristics. It also occurs in the presence of building level controls for prior AYP
passage, racial make-up, enrollment, building type, and the level of student financial need. The
estimated impact of this disparity grows in magnitude as the building fails to make AYP in 
consecutive years and faces more significant NCLB sanctions. This finding is also stronger at 
schools that are a priori more likely to fail to make AYP. Further, this finding remains even 
after controlling for a second type of strategic instruction that may occur when building 
administrators target resources towards students based upon their prior test scores. Finally, these 
findings are robust to non-random sample attrition from the WASL and ITBS tests. Taken as a 
whole, this evidence suggests that building administrators participate in strategic instruction; that 
is, administrators focus their efforts on racial groups that have trouble making AYP. Given the 
limits on school resources, this redirection of resources towards one racial group causes a 
diminution in academic performance of students in successful racial groups. 

Two arguments may be made that these findings underestimate the true impact of the 
NCLB. First, consider a school where each racial group made AYP in the prior year but one 
group was close to failure. Because the required pass rate rises each year, should this school fail 
to increase the performance of the group that barely passed, it will fail in future years. A school 
in this position has incentives to perform strategic instruction prior to failure but, under the 
empirical strategy used in this paper, would not be identified as doing so. Thus, the estimated 
impacts of strategic instruction may understate the actual impacts of inter-race resource shifting. 
The second issue has to do with the minimum size of the racial group required to determine AYP 
failure. Under the NCLB, schools with fewer than 30 students in a demographic group
automatically receive AYP for that group. This requirement reduces the incentive to perform
strategic instruction at small schools; however, schools that are close to this limit may participate
in strategic instruction because an unforeseen addition of one or two students may make that
school accountable under the NCLB. Since these schools were automatically classified as AYP
schools, this research may again understate the actual impacts of strategic instruction. 13

At the time of its passage, one of the stated goals of the NCLB was to eliminate the 
achievement gap between students of different races and backgrounds. The NCLB may 
accomplish this in an unintended manner by reducing the performance of children in successful 
racial groups. However, shifting of resources from students of successful racial groups to less 
successful ones is not necessarily an inefficient use of resources. If prior to the NCLB schools 
over-allocated resources towards a particular racial group (perhaps as a result of successful 
parental lobbying), then strategic instruction may result in a more efficient allocation of 
resources. Further, while this research presents evidence that the relative positions of racial
groups have been affected by the NCLB, it also documents higher WASL test scores after the
enactment of the NCLB. Thus, the NCLB may have changed relative racial performance while 
simultaneously increasing overall performance. 

The implications of the current structure of the NCLB can be significant for the futures 
of schools and society. Schools which focus their attention on poorly performing racial groups 
run the risk of reducing performance in their high performing racial groups. Over a period of 
time, it is possible that these schools will find that they have inadequately prepared students in 
these groups for success on the high stakes test at later grades. In short, these schools may trade
AYP today for future failure at their district's middle and high schools when the fourth graders
advance. This becomes especially important as the required pass rate increases and the
performance of all students, even those in passing racial groups, becomes more critical in
determining a building's AYP. For society, it is not clear that transferring resources from one
racial group to another is a costless endeavor. If, for instance, schools change curricula to better
engage students in an at-risk racial group, members of that group may improve but perhaps by
less than members of the other groups deteriorate. The gains made by one group may or may not
compensate for the losses suffered by others.

13 Incidentally, the requirement of 30 students per demographic group introduces another potential basis of strategic
instruction. It is possible that district administrators shift school boundaries or busing routes to purposefully keep
individual schools from reaching the 30 student level in weaker demographic groups.

Simple alterations to the NCLB could prevent this type of strategic instruction and 
maintain its focus on reducing racial disparities. For instance, rather than measuring the percent 
of each racial group that passes the WASL, the NCLB could measure year-to-year average test 
gains by racial group and then require each racial group to demonstrate some appropriate amount 
of gains. Such a system would eliminate the incentive to focus on poorly performing racial 
groups at the expense of highly performing ones. 
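The difference between the two accountability measures can be illustrated with a small Python sketch. The scores, the cutoff of 400, and the two allocation scenarios are invented for illustration and are not drawn from the data; the sketch only shows how a pass-rate measure and a gain measure can rank the same two outcomes differently.

```python
def pass_rate(scores, cutoff):
    """Fraction of students scoring at or above the proficiency cutoff."""
    return sum(score >= cutoff for score in scores) / len(scores)


def average_gain(prior_scores, current_scores):
    """Mean year-to-year change in score across the same students."""
    gains = [cur - pri for pri, cur in zip(prior_scores, current_scores)]
    return sum(gains) / len(gains)


CUTOFF = 400  # hypothetical proficiency cutoff
prior = [380, 395, 420, 450]  # hypothetical prior-year scores for one group

# Scenario 1: resources concentrated on the student just below the cutoff.
targeted = [382, 405, 420, 450]
# Scenario 2: resources spread across all students in the group.
broad = [390, 399, 430, 460]

targeted_summary = (pass_rate(targeted, CUTOFF), average_gain(prior, targeted))
broad_summary = (pass_rate(broad, CUTOFF), average_gain(prior, broad))
```

In this made-up example the targeted scenario has the higher pass rate (0.75 against 0.5) but the lower average gain (3.0 against 8.5), which is the sense in which a gains-based requirement would weaken the incentive to concentrate on marginal students.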

Table 1: Descriptive Statistics

                        Panel A: Cohorts Where AYP is Known       Panel B: Cohorts Where AYP is Unknown
                        Students at       Students of race        Students at future   Students of race
                        non-failing       other than that         non-failing          other than that
Variable                schools           which failed            schools              which will fail

WASL                    .085 (.977)       -.270 (.958)            .096 (.963)          -.209 (.956)
ITBS                    62.70 (28.40)     56.06 (28.76)           61.63 (28.49)        54.11 (28.75)
Indian                  .024 (.153)       .026 (.159)             .023 (.151)          .094 (.292)
Asian                   .080 (.271)       .080 (.271)             .075 (.264)          .079 (.270)
Black                   .052 (.221)       .057 (.232)             .052 (.222)          .049 (.216)
Hispanic                .130 (.336)       .114 (.318)             .108 (.311)          .092 (.289)
White                   .707 (.455)       .717 (.450)             .730 (.444)          .685 (.464)
English Never           .123 (.328)       .128 (.334)             .110 (.313)          .122 (.327)

% Indian                2.55 (5.78)       4.66 (12.76)            2.47 (5.00)          8.96 (18.96)
% Asian                 8.15 (8.74)       4.31 (7.14)             7.69 (8.21)          4.20 (6.96)
% Black                 5.57 (8.25)       7.15 (14.31)            5.69 (8.60)          7.12 (14.00)
% Hispanic              13.76 (18.05)     51.51 (24.83)           11.85 (16.45)        45.24 (26.71)
% Free/Reduced          39.35 (22.97)     74.17 (15.65)           37.47 (22.70)        72.01 (14.81)
Enrollment per grade    76.40 (26.87)     86.41 (23.15)           76.97 (27.59)        82.74 (19.84)

Number of obs.          110,755           1,730                   113,933              1,820
Number of schools       287               25                      283                  25

Table 2: OLS Estimates of WASL Scores

Panel A
  λ   NCLB                  .030***  (.003)
  α   AYPFAIL               -.044*** (.008)
  γ   AYPFAILRACE           -.050**  (.021)
      R²                    .577
      N                     228,238
      L                     10

Panel B
  λ   NCLB                  .031***  (.003)
  α   AYPFAIL               -.031*** (.008)
  ν   AYPFAILTWICE          -.090*** (.021)
  γ   AYPFAILRACE           -.054**  (.022)
  ζ   AYPFAILRACETWICE      -.035*   (.019)
      R²                    .577
      N                     228,238
      L                     10
      F-test of γ = ζ = 0   4.27**

Notes: *** {**} (*) represent statistical significance at the 1% {5%} and (10%) levels.
Standard errors corrected for clustering within buildings are in parentheses. All
regressions contain the independent variables listed in note 6.

Table 3: OLS Estimates of WASL Scores
Using Interacted Student ITBS Scores

  λ    NCLB                       .030***      (.003)
  α    AYPFAIL                    [illegible]  (.032)
  γ    AYPFAILRACE                -.058***     (.021)
  ξ1   AYPFAILxITBS               .0007        (.002)
  ξ2   AYPFAILxITBS²              .00004       (.00005)
  ξ3   AYPFAILxITBS³              -.0000005    (.0000004)
       R²                         .577
       N                          228,238
       L                          10
       M                          3
       F-test of ξ1 = ξ2 = ξ3 = 0 8.64 (.000)

Notes: *** {**} (*) represent statistical significance at the 1% {5%} and (10%) levels.
Standard errors corrected for clustering within buildings are in parentheses. All
regressions contain the independent variables listed in note 6.

Table 4: OLS Estimates of WASL Scores by Building Likelihood of Making AYP

                        Lowest Third       Middle Third       Highest Third
  λ   NCLB              .030***  (.006)    .031*** (.005)     .037***  (.005)
  α   AYPFAIL           -.052*** (.010)    -.034   (.023)     -.065*** (.023)
  γ   AYPFAILRACE       -.062*** (.021)    -.021   (.061)     -.021    (.073)
      R²                .563               .562               .552
      N                 68,930             73,880             85,428

Notes: *** {**} (*) represent statistical significance at the 1% {5%} and (10%) levels.
Standard errors corrected for clustering within buildings are in parentheses. All
regressions contain the independent variables listed in note 6.



Table 5: Numbers of Included and Excluded Students

Academic     General      Missing ITBS     Missing WASL     Missing both     Valid
Year         Education    Score            Score            ITBS & WASL      Observations

2001-2002    67,346       4,922 (7.3%)     3,808 (5.6%)     829 (1.2%)       57,787 (85.8%)
2002-2003    67,878       5,323 (7.8%)     3,944 (5.8%)     645 (1.0%)       57,966 (85.4%)
2003-2004    65,583       4,756 (7.3%)     3,500 (5.3%)     667 (1.0%)       56,660 (86.4%)
2004-2005    65,669       5,124 (7.8%)     3,855 (5.9%)     865 (1.3%)       55,825 (85.0%)
Total        266,476      20,125 (7.5%)    15,107 (5.7%)    3,006 (1.1%)     228,238 (85.6%)

Table 6: Estimates of WASL Scores Controlling for Non-Random Attrition of Students

                            Panel A                               Panel B            Panel C
                            1st Stage         2nd Stage           OLS with           1st Stage         2nd Stage
                            Probit            Heckit              Estimated ITBS     Probit            Heckit

  λ   NCLB                  .012**   (.006)   .028***  (.003)     .031***  (.003)    .010*    (.006)   .026***  (.003)
  α   AYPFAIL               -.209*** (.019)   .003     (.012)     -.040*** (.006)    -.202*** (.018)   .077***  (.007)
  γ   AYPFAILRACE           .900***  (.037)   -.087**  (.041)     -.060*** (.017)    .883***  (.037)   -.071*** (.027)
      Population
      Growth Rate           .037***  (.004)   --                  --                 .014***  (.004)   --
      Inverse Mills
      Ratio                 --                [illegible]*** (.178)  --              --                3 28***  (.094)
      R²                                      .575                .628                                 .632
      N                     243,345           243,345             248,363            263,470           263,470

Notes: *** {**} (*) represent statistical significance at the 1% {5%} and (10%) levels.
The dependent variable for the 1st stage probit equals one if the observation did not take
the WASL and equals zero otherwise. Standard errors corrected for clustering within
buildings are in parentheses. All regressions contain the independent variables listed in
note 6.